
A Dynamic Noise Level Algorithm for Spectral Screening of Peptide MS/MS Spectra

Author: DN Perkins
H Xu
H Xu
H Xu
Hua Xu
I Sures
JE Elias
JWH Wong
K Flikka
LW Zhang
LY Geer
M Bern
Michael A Freitas
R Aebersold
R Craig
RE Moore
RG Sadygov
S Purvine
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: High-throughput shotgun proteomics data contain a significant number of spectra from non-peptide ions or spectra of too poor quality to obtain highly confident peptide identifications. These spectra cannot be identified with any positive peptide matches in some database search programs or are identified with false positives in others. Removing these spectra can improve the database search results and lower computational expense. Results: A new algorithm has been developed to filter tandem mass spectra of poor quality from shotgun proteomic experiments. The algorithm determines the noise level dynamically and independently for each spectrum in a tandem mass spectrometric data set. Spectra are filtered based on a minimum number of required signal peaks with a signal-to-noise ratio of 2. The algorithm was tested with 23 sample data sets containing 62,117 total spectra. Conclusions: The spectral screening removed 89.0% of the tandem mass spectra that did not yield a peptide match when searched with the MassMatrix database search software. Only 6.0% of tandem mass spectra that yielded peptide matches considered to be true positive matches were lost after spectral screening. The algorithm was found to be very effective at removal of unidentified spectra in other database search programs including Mascot, OMSSA, and X!Tandem (75.93%-91.00%) with a small loss (3.59%-9.40%) of true positive matches.
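
The screening rule in the abstract above (keep a spectrum only if enough peaks reach a signal-to-noise ratio of 2) can be sketched roughly as follows. The abstract does not say how the per-spectrum noise level is derived, so the median-intensity estimate, the function names, and the `min_signal_peaks` default are all assumptions for illustration, not the published method.

```python
from statistics import median


def estimate_noise(intensities):
    """Hypothetical per-spectrum noise estimate: the median peak intensity.

    The published algorithm determines noise dynamically per spectrum;
    the median is only a stand-in assumption here.
    """
    return median(intensities)


def passes_screen(intensities, snr=2.0, min_signal_peaks=5):
    """Return True if the spectrum has enough peaks at or above snr * noise."""
    if not intensities:
        return False  # an empty spectrum cannot contain signal peaks
    noise = estimate_noise(intensities)
    signal_peaks = [i for i in intensities if i >= snr * noise]
    return len(signal_peaks) >= min_signal_peaks


# Usage: one spectrum with clear peaks above a low baseline, one without.
good = [1, 1, 1, 1, 1, 1, 1, 10, 12, 15, 20, 30]
flat = [5, 6, 5, 6, 5, 6, 5, 6]
kept = [s for s in (good, flat) if passes_screen(s)]
```

Because the noise level is recomputed for each spectrum, a fixed intensity cutoff is never needed; only the ratio and the required peak count are tuned.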

Protein comparison at the domain architecture level

Author: AK Bjorklund
Byungwook Lee
C Chothia
C Vogel
CP Ponting
Doheon Lee
H Tordai
JH Fong
K Lin
LY Geer
M Balestre
M Punta
MK Basu
MK Basu
N Song
N Song
P Glenisson
S Yu
SF Altschul
V Hollich
Publication venue: BioMed Central
Publication date: 03/12/2009

ProtQuant: a tool for the label-free quantification of MudPIT proteomics data

Author: A Keller
B Nanduri
B Nanduri
Bindu Nanduri
D Hua
DN Perkins
E Durr
G Bryce Magee
H Liu
I Scheel
II Stewart
J Gao
JE Elias
JE Elias
JK Eng
JK Eng
L Florens
LY Geer
MP Washburn
Nan Wang
PL Ross
SE Ong
Shane C Burgess
SP Gygi
Susan M Bridges
W Paul Williams
Y Ishihama
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract. Background: Effective and economical methods for quantitative analysis of high throughput mass spectrometry data are essential to meet the goals of directly identifying, characterizing, and quantifying proteins from a particular cell state. Multidimensional Protein Identification Technology (MudPIT) is a common approach used in protein identification. Two types of methods are used to detect differential protein expression in MudPIT experiments: those involving stable isotope labelling and the so-called label-free methods. Label-free methods are based on the relationship between protein abundance and sampling statistics such as peptide count, spectral count, probabilistic peptide identification scores, and sum of peptide Sequest XCorr scores (ΣXCorr). Although a number of label-free methods for protein quantification have been described in the literature, there are few publicly available tools that implement these methods. We describe ProtQuant, a Java-based tool for label-free protein quantification that uses the previously published ΣXCorr method for quantification and includes an improved method for handling missing data. Results: ProtQuant was designed for ease of use and portability for the bench scientist. It implements the ΣXCorr method for label-free protein quantification from MudPIT datasets. ProtQuant has a graphical user interface, accepts multiple file formats, is not limited by the size of the input files, and can process any number of replicates and any number of treatments. In addition, ProtQuant implements a new method for dealing with missing values for peptide scores used for quantification. The new algorithm, called ΣXCorr*, uses "below threshold" peptide scores to provide meaningful non-zero values for missing data points. We demonstrate that ΣXCorr* produces an average reduction in false positive identifications of differential expression of 25% compared to ΣXCorr. Conclusion: ProtQuant is a tool for protein quantification built for multi-platform use with an intuitive user interface. ProtQuant efficiently and uniquely performs label-free quantification of protein datasets produced with Sequest and provides the user with facilities for data management and analysis. Importantly, ProtQuant is available as a self-installing executable for the Windows environment used by many bench scientists.
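
To make the ΣXCorr idea concrete, the sketch below sums peptide XCorr scores per protein and then shows one possible reading of the "below threshold" fallback. It is written in Python rather than the tool's Java, and the function names, the input shape, and the exact fallback rule are assumptions; the abstract does not specify how ΣXCorr* combines below-threshold scores.

```python
from collections import defaultdict


def sigma_xcorr(peptide_hits, threshold):
    """Sum XCorr per protein over peptides scoring at or above the threshold."""
    totals = defaultdict(float)
    for protein, xcorr in peptide_hits:
        if xcorr >= threshold:
            totals[protein] += xcorr
    return dict(totals)


def sigma_xcorr_star(peptide_hits, threshold):
    """Assumed variant: proteins with no above-threshold peptide get the sum
    of their below-threshold scores instead of a missing (zero) value."""
    result = sigma_xcorr(peptide_hits, threshold)
    below = defaultdict(float)
    for protein, xcorr in peptide_hits:
        if xcorr < threshold:
            below[protein] += xcorr
    for protein, total in below.items():
        # Only fill in proteins that would otherwise be missing.
        result.setdefault(protein, total)
    return result


# Usage with hypothetical (protein, XCorr) pairs and a hypothetical threshold.
hits = [("P1", 3.0), ("P1", 2.5), ("P2", 1.0), ("P2", 0.5)]
plain = sigma_xcorr(hits, threshold=2.0)
star = sigma_xcorr_star(hits, threshold=2.0)
```

Under this reading, a protein seen only through weak peptides still receives a small non-zero value, which is the behaviour the abstract attributes to ΣXCorr*.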

A mass accuracy sensitive probability based scoring algorithm for database searching of tandem mass spectrometry data

Author: A Keller
AG Sullivan
AI Nesvizhskii
B Paizs
BT Hansen
DF Hunt
DF Hunt
DN Perkins
EA Kapp
EN Nikolaev
Hua Xu
J Colinge
JEP Syka
JK Eng
JV Olsen
K Biemann
KG Standing
KR Clauser
LY Geer
M Havilio
M Mann
Michael A Freitas
MJ MacCoss
N Zhang
R Bakhtiar
RG Sadygov
RG Sadygov
RG Sadygov
V Bafna
V Dancík
W Qian
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract. Background: Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has become one of the most used tools in mass spectrometry based proteomics. Various algorithms have since been developed to automate the process for modern high-throughput LC-MS/MS experiments. Results: A probability based statistical scoring model for assessing peptide and protein matches in tandem MS database search was derived. The statistical scores in the model represent the probability that a peptide match is a random occurrence based on the number or the total abundance of matched product ions in the experimental spectrum. The model also calculates probability based scores to assess protein matches. Thus the protein scores in the model reflect the significance of protein matches and can be used to differentiate true from random protein matches. Conclusion: The model is sensitive to high mass accuracy and implicitly takes mass accuracy into account during scoring. High mass accuracy not only reduces false positives but also improves the scores of true positive matches. The algorithm is incorporated in an automated database search program, MassMatrix.

KnowledgeBank at OSU

Proteomic analysis of the Plasmodium male gamete reveals the key role for glycolysis in flagellar motility.

Author: A Alexa
A Bairoch
A Bernsel
A Bernsel
A Creasey
A Garg
A Krogh
AB Vaidya
AE Lobley
AJ Link
AL Beetsma
AM Talman
Andrea Ecker
Arthur M Talman
B Maclean
B Wickstead
B Wickstead
BF Mitchell
BH Gibbons
Ceereena Ubaida-Mohien
CJ Brokaw
CJ Janse
CL Gatlin
CS Yu
CU Mohien
DA van Schalkwyk
David R Graham
DL Tabb
DM Baron
DN Perkins
E Deligianni
EF Smith
Georges K Christophides
J Stumpff
JA Vizcaino
JE Elias
John R Yates
Judith H Prieto
K Lal
K Lal
K Pfister
K Slavic
KC Chou
L Fox
LA Kelley
LJ Briggs
LY Geer
M Aikawa
M Aikawa
M Aikawa
M Bern
M Krisfalusi
M Oberholzer
M Punta
M Sakato
Mara Lawniczak
Mark N Wass
Michael JE Sternberg
ML Gupta
MN Wass
MP Washburn
MR van Dijk
N Edwards
N Hall
N Okamoto
O Billker
P Horton
P Yang
PC Bradbury
R Carter
R Craig
R Tewari
R Yokoyama
RC Gentleman
RE Sinden
RE Sinden
RE Sinden
RE Sinden
Rebecca S Stanway
Rhoel R Dinglasan
RM Tombes
Robert E Sinden
Roland Frank
S Briesemeister
S Eksi
S Eksi
S Hunter
S Tanner
Sanjeev Krishna
Sara Marques
SC Dawson
SF Altschul
SM Khan
SM King
T Hawkins
T Joet
Tao Xu
U Straschil
W Daher
WCL Ford
Y Benjamini
YJ Liu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

BACKGROUND: Gametogenesis and fertilization play crucial roles in malaria transmission. While male gametes are thought to be amongst the simplest eukaryotic cells and are proven targets of transmission blocking immunity, little is known about their molecular organization. For example, the pathway of energy metabolism that powers motility, a feature that facilitates gamete encounter and fertilization, is unknown. METHODS: Plasmodium berghei microgametes were purified and analysed by whole-cell proteomic analysis for the first time. Data are available via ProteomeXchange with identifier PXD001163. RESULTS: 615 proteins were recovered, including all male gamete proteins described thus far. Amongst them were the 11 enzymes of the glycolytic pathway. The hexose transporter was localized to the gamete plasma membrane and it was shown that microgamete motility can be suppressed effectively by inhibitors of this transporter and of the glycolytic pathway. CONCLUSIONS: This study describes the first whole-cell proteomic analysis of the malaria male gamete. It identifies glycolysis as the likely exclusive source of energy for flagellar beat, and provides new insights into original features of Plasmodium flagellar organization.

Oxford University Research Archive

Kent Academic Repository

Spiral - Imperial College Digital Repository

Online-Publikations-Server der Universität Würzburg

Bern Open Repository and Information System (BORIS)

St George's Online Research Archive

ETISEQ – an algorithm for automated elution time ion sequencing of concurrently fragmented peptides for mass spectrometry-based proteomics

Author: A Keller
A Schlosser
AA Ramos
AB Chakraborty
Alexander B Schwahn
BF Cravatt
C Baumgartner
DS Moore
FW McLafferty
HB Liu
J Castro-Perez
J Malmström
Jason WH Wong
JB Fenn
JD Venable
JE Elias
JK Eng
JWH Wong
JWH Wong
Kevin M Downard
LF Wu
LY Geer
MP Washburn
N Mujezinovic
QZ Hu
R Aebersold
R Craig
RA Zubarev
S Purvine
S Tanner
SP Gygi
WDv Dongen
WH Press
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: Concurrent peptide fragmentation (i.e. shotgun CID, parallel CID or MSE) has emerged as an alternative to data-dependent acquisition in generating peptide fragmentation data in LC-MS/MS proteomics experiments. Concurrent peptide fragmentation data acquisition has been shown to be advantageous over data-dependent acquisition by providing greater detection dynamic range and more accurate quantitative information. Nevertheless, concurrent peptide fragmentation data acquisition has yet to be widely adopted due to the lack of published algorithms designed specifically to process or interpret such data acquired on any mass spectrometer. Results: An algorithm called Elution Time Ion Sequencing (ETISEQ) has been developed to enable automated conversion of concurrent peptide fragmentation data to LC-MS/MS data. ETISEQ generates MS/MS-like spectra based on the correlation of precursor and product ion elution profiles. The performance of ETISEQ is demonstrated using concurrent peptide fragmentation data from tryptic digests of standard proteins and whole influenza virus. It is shown that the number of unique peptides identified from the digests is broadly comparable between ETISEQ processed concurrent peptide fragmentation data and the data-dependent acquired LC-MS/MS data. Conclusion: The ETISEQ algorithm has been designed for easy integration with existing MS/MS analysis platforms. It is anticipated that it will popularize concurrent peptide fragmentation data acquisition in proteomics laboratories.
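
The core idea named in the abstract above, grouping product ions with a precursor whose elution profile they track, can be illustrated with a small sketch. The use of Pearson correlation, the `min_r` cutoff, the function names, and the data layout are assumptions chosen for illustration; the abstract does not state which correlation measure or threshold ETISEQ uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no elution shape to correlate
    return cov / sqrt(var_x * var_y)


def assign_products(precursor_profile, product_profiles, min_r=0.9):
    """Return product-ion m/z values whose elution profile tracks the precursor."""
    return sorted(
        mz
        for mz, profile in product_profiles.items()
        if pearson(precursor_profile, profile) >= min_r
    )


# Usage: intensities sampled at the same five retention-time points.
precursor = [0, 2, 8, 2, 0]
products = {
    175.1: [0, 1, 4, 1, 0],  # co-elutes with the precursor
    262.2: [5, 4, 0, 4, 5],  # elutes elsewhere
    300.3: [3, 3, 3, 3, 3],  # constant background
}
spectrum_mz = assign_products(precursor, products)
```

The product ions that pass the cutoff form an MS/MS-like peak list for that precursor, which is the kind of output a conventional database search engine expects.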

HKU Scholars Hub

Identification of alternative splice variants in Aspergillus flavus through comparison of multiple tandem MS search algorithms

Author: A Marchler-Bauer
AI Nesvizhskii
AJ Link
BC Searle
BM Balgley
C Barber
C Hughes
David C Muddiman
DL Tabb
DN Perkins
DR Georgianna
EA Kapp
H Choi
HM Holden
J Cox
JB Thoden
JE Elias
JK Eng
Kung-Yen Chang
KY Chang
L Florea
L Käll
LY Geer
M Margulies
MN Bainbridge
MP Washburn
N Edwards
R Craig
RG Sadygov
S Heber
S Tanner
SF Altschul
W Yu
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Database searching is the most frequently used approach for automated peptide assignment and protein inference of tandem mass spectra. The results, however, depend on the sequences in target databases and on search algorithms. Recently, by using an alternative splicing database, we identified more proteins than with the annotated proteins in Aspergillus flavus. In this study, we aimed at finding a greater number of eligible splice variants based on newly available transcript sequences and the latest genome annotation. The improved database was then used to compare four search algorithms: Mascot, OMSSA, X!Tandem, and InsPecT. Results: The updated alternative splicing database predicted 15833 putative protein variants, 61% more than the previous results. There was transcript evidence for 50% of the updated genes compared to the previous 35% coverage. Database searches were conducted using the same set of spectral data, search parameters, and protein database but with different algorithms. The false discovery rates of the peptide-spectrum matches were estimated at < 2%. The numbers of the total identified proteins varied from 765 to 867 between algorithms. Whereas 42% (1651/3891) of peptide assignments were unanimous, the comparison showed that 51% (568/1114) of the RefSeq proteins and 15% (11/72) of the putative splice variants were inferred by all algorithms. 12 plausible isoforms were discovered by focusing on the consensus peptides which were detected by at least three different algorithms. The analysis found different conserved domains in two putative isoforms of UDP-galactose 4-epimerase. Conclusions: We were able to detect dozens of new peptides using the improved alternative splicing database with the recently updated annotation of the A. flavus genome. Unlike the identifications of the peptides and the RefSeq proteins, large variations existed between the putative splice variants identified by different algorithms. 12 candidates of putative isoforms were reported based on the consensus peptide-spectrum matches. This suggests that applications of multiple search engines effectively reduced the possible false positive results and validated the protein identifications from tandem mass spectra using an alternative splicing database.
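
The consensus step mentioned above, keeping peptides reported by at least three of the four search algorithms, amounts to a simple vote count. The sketch below assumes each engine's results have already been reduced to a list of peptide sequences; the function name, the input layout, and the example sequences are hypothetical.

```python
from collections import Counter


def consensus_peptides(results_by_engine, min_engines=3):
    """Return peptides reported by at least `min_engines` search engines."""
    votes = Counter()
    for peptides in results_by_engine.values():
        # Count each peptide once per engine, even if it matched several spectra.
        votes.update(set(peptides))
    return sorted(p for p, n in votes.items() if n >= min_engines)


# Usage with made-up peptide sequences.
results = {
    "Mascot": ["AAK", "CCR", "DDK"],
    "OMSSA": ["AAK", "CCR"],
    "X!Tandem": ["AAK", "DDK", "AAK"],
    "InsPecT": ["AAK", "CCR"],
}
consensus = consensus_peptides(results)
```

Setting `min_engines` to the number of engines gives the unanimous assignments, while lower values trade stringency for coverage.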

Harvest: an open-source tool for the validation and improvement of peptide identification metrics and fragmentation exploration

Author: A Cooksey
A Keller
AA Klammer
AC Sauve
B Domon
BE Frewen
C Zhou
David N Perkins
DC Chamrad
EA Kapp
EC Huang
J Colinge
J Peng
J Samuelsson
JA Falkner
JE Elias
JK Eng
Jonathan W Arthur
JWH Wong
KA Resing
L McHugh
Leo C McHugh
LY Geer
M Bern
M Brosch
M Mann
N Zhang
R Aebersold
RE Higgs
RJ Arnold
S Garbisa
VH Wysocki
W Zhang
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Protein identification using mass spectrometry is an important tool in many areas of the life sciences, and in proteomics research in particular. Increasing the number of proteins correctly identified is dependent on the ability to include new knowledge about the mass spectrometry fragmentation process in computational algorithms designed to separate true matches of peptides to unidentified mass spectra from spurious matches. This discrimination is achieved by computing a function of the various features of the potential match between the observed and theoretical spectra to give a numerical approximation of their similarity. It is these underlying "metrics" that determine the ability of a protein identification package to maximise correct identifications while limiting false discovery rates. There is currently no software available specifically for the simple implementation and analysis of arbitrary novel metrics for peptide matching and for the exploration of fragmentation patterns for a given dataset. Results: We present Harvest: an open source software tool for analysing fragmentation patterns and assessing the power of a new piece of information about the MS/MS fragmentation process to more clearly differentiate between correct and random peptide assignments. We demonstrate this functionality using data metrics derived from the properties of individual datasets in a peptide identification context. Using Harvest, we demonstrate how the development of such metrics may improve correct peptide assignment confidence in the context of a high-throughput proteomics experiment and characterise properties of peptide fragmentation. Conclusions: Harvest provides a simple framework in C++ for analysing and prototyping metrics for peptide matching, the core of the protein identification problem. It is not a protein identification package and answers a different research question from packages such as Sequest, Mascot, and X!Tandem. It does not aim to maximise the number of assigned peptides from a set of unknown spectra, but instead provides a method by which researchers can explore fragmentation properties and assess the power of novel metrics for peptide matching in the context of a given experiment. Metrics developed using Harvest may then become candidates for later integration into protein identification packages.

Public Library of Science (PLOS)

Sydney eScholarship

Disulphide Bridges of Phospholipase C of Chlamydomonas reinhardtii Modulates Lipid Interaction and Dimer Stability

Author: A Dabdoub
AU Singer
B Ananthanarayanan
B Miao
BL Blazer-Yost
C Eichwald
C Shao
DW Heinz
EJ Kaftan
Eric Cascales
G Du
G Romero
HF Paterson
JM Han
JW Lomasney
Jyoti Batra
K Tamura
KC Chou
LM Quarmby
LY Geer
M Katan
M Nomikos
M Pu
MA Larkin
Mayanka Awasthi
MJ Berridge
MJ Berridge
MR Hokin
MV Ellis
N Dekker
P Fariselli
RL Kingma
S Dowler
SA Arisz
SG Rhee
Suneel Kateriya
T Castrignano
T Moriyama
V Raussens
X Zhang
Y Banno
Publication venue: Public Library of Science
Publication date: 21/06/2012

BACKGROUND: Phospholipase C (PLC) is an enzyme that plays a pivotal role in a number of signaling cascades. It is active in the plasma membrane and triggers cellular responses by catalyzing the hydrolysis of membrane phospholipids, thereby generating secondary messengers. Phosphatidylinositol-PLC (PI-PLC) specifically interacts with phosphoinositide and/or phosphoinositol and catalyzes specific cleavage of the sn-3 phosphodiester bond. Several isoforms of PLC are known to form and function as dimers, but very little is known about the molecular basis of the dimerization and its importance in the lipid interaction. PRINCIPAL FINDINGS: We herein report that the disruption of a disulphide bond of a novel PI-specific PLC of C. reinhardtii (CrPLC) can modulate its interaction affinity with a set of phospholipids and also the stability of its dimer. CrPLC was found to form a mixture of higher oligomeric states with monomer and dimer as major species. The dimer adduct of CrPLC disappeared in the presence of DTT, which suggested the involvement of disulphide bond(s) in CrPLC oligomerization. Dimer-monomer equilibrium studies with the isolated fractions of CrPLC monomer and dimer supported the involvement of covalent forces in the dimerization of CrPLC. A disulphide bridge was found to be responsible for the dimerization, and Cys7 seems to be involved in the formation of the disulphide bond. This crucial disulphide bond also modulated the lipid affinity of CrPLC. Oligomers of CrPLC were also captured under in vivo conditions. CrPLC was mainly found to be localized in the plasma membrane of the cell. The cell surface localization of CrPLC may have significant implications for the downstream regulatory function of CrPLC. SIGNIFICANCE: This study helps in establishing the role of CrPLC (or similar proteins) in the quaternary structure of the molecule and its affinities during lipid interactions.